Evaluating the Scalability of Data Mining Provider Classifiers
Authors
Abstract
Two classifiers implemented as Data Mining Providers are considered. These providers run as stand-alone servers or are aggregated with Microsoft® SQL Server. One of the classifiers is the Microsoft® Decision Trees algorithm. The other is the Simple Naive Bayes incremental classifier, which supports continuous input attributes, multiple discrete predictable attributes, and incremental updating of the training data set. The performance study carried out to verify the scalability of the classifiers covers four factors: cardinality (number of training cases), number of input attributes, number of states of the input attributes, and number of predictable attributes.
Similar resources
Pruning Meta-Classifiers in a Distributed Data Mining System
JAM is a powerful and portable agent-based distributed data mining system that employs metalearning techniques to integrate a number of independent classifiers (models) derived in parallel from independent and (possibly) inherently distributed databases. Although meta-learning promotes scalability and accuracy in a simple and straightforward manner, brute force metalearning techniques can resul...
Pruning Meta-Classifiers in a Distributed Data Mining System CUCS-032-97
JAM is a powerful and portable agent-based distributed data mining system that employs metalearning techniques to integrate a number of independent classifiers (models) derived in parallel from independent and (possibly) inherently distributed databases. Although meta-learning promotes scalability and accuracy in a simple and straightforward manner, brute force meta-learning techniques can resu...
Pruning Classifiers in a Distributed Meta-Learning System
JAM is a powerful and portable agent-based distributed data mining system that employs meta-learning techniques to integrate a number of independent classifiers (concepts) derived in parallel from independent and (possibly) inherently distributed databases. Although metalearning promotes scalability and accuracy in a simple and straightforward manner, brute force meta-learning techniques can re...
Efficient Data Mining with Evolutionary Algorithms for Cloud Computing Application
With the rapid development of the internet, the amount of information and data being produced is massive. Hence, clients can be overwhelmed by the volume of data and find it difficult to tell which parts are useful. Data mining can overcome this problem. When data mining runs on cloud computing, it reduces processing time, energy usage and costs. As the speed of ...
Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining
This paper presents a data mining application in metabolomics. It aims at building an enhanced machine learning classifier that can be used for diagnosing cachexia syndrome and identifying its involved biomarkers. To achieve this goal, a data-driven analysis is carried out using a public dataset consisting of 1H-NMR metabolite profile. This dataset suffers from the problem of imbalanced classes...
Publication date: 2003